Abstract
As part of an initiative to explore teaching strategies that increase exposure to and understanding of reproducible research in GIScience, we conducted an intervention with students enrolled in two GIS-related master’s programmes during the academic year 2019/20: the Erasmus Mundus MSc in Geospatial Technologies and the MSc Geomatics. This notebook analyses participants’ responses to two questionnaires (pre-test and post-test) on their prior knowledge of reproducible research concepts and practices, as well as the self-assessment exercise on the reproducibility level of their master’s theses. Results were presented at the Research Reproducibility 2020 conference (see (Granell, Sileryte, and Nüst 2020)).
This document does not install the required R packages by default. You can run the script install.R to install all required dependencies on a new R installation, or use install.packages(..) to install missing R packages.
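The logic of such a script can be sketched as follows (a minimal sketch, assuming install.R simply installs whichever of the packages loaded below are missing):

```r
# Sketch of an install.R-style helper: install only the packages
# that are not yet present (the list mirrors the library() calls below).
pkgs <- c("tidyverse", "forcats", "kableExtra", "here", "stringr",
          "googlesheets4", "likert", "magick", "patchwork",
          "ggthemes", "scales", "gridExtra")
missing <- setdiff(pkgs, rownames(installed.packages()))
if (length(missing) > 0) {
  install.packages(missing)
}
```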
library(tidyverse)
library(forcats)
library(kableExtra)
library(here)
library(stringr)
library(googlesheets4)
library(likert)
library(magick)
library(patchwork)
library(ggthemes)
library(scales)
library(gridExtra)
To create the PDF of the computational notebook, run the following commands in a new R session. If you have problems rendering the PDF, you can execute each chunk independently in RStudio.
require("knitr")
require("rmarkdown")
rmarkdown::render("self-assessment-experiment.Rmd", output_format = "pdf_document")
Eligible participants were initially all students enrolled in the Master Thesis Project in the academic year 2019/2020 in two GIS-related master’s programmes: the Erasmus Mundus MSc in Geospatial Technologies and the MSc Geomatics. The former is offered by three universities (UJI, NOVA IMS Lisboa and Münster); the latter by TU Delft. As participation was voluntary, the experiment did not count towards the final grade of the master’s thesis.
Students who answered the questionnaires but were not enrolled in the master’s programmes above (e.g., PhD students who participated in a doctoral course at UJI) were removed from the analysis. Fields containing personal information (email, thesis handlers, URLs to PDF files) were also removed and therefore not used in the analysis.
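This filtering step can be illustrated with a short dplyr sketch (the column names below are hypothetical stand-ins, not the real survey schema):

```r
library(dplyr)
library(tibble)

# Toy data standing in for the raw questionnaire export
# (column names are hypothetical).
responses <- tibble(
  programme = c("Geospatial Technologies", "Geomatics", "Doctoral course"),
  email     = c("a@example.org", "b@example.org", "c@example.org"),
  answer    = c(4, 5, 3)
)

filtered <- responses %>%
  # keep only students from the two master's programmes
  filter(programme %in% c("Geospatial Technologies", "Geomatics")) %>%
  # drop fields containing personal information
  select(-email)
```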
A brief summary of the steps followed is given below. Additional notes are also available.
Step 1. Participants answered the first questionnaire (pre-test assessment) at the start of the semester (or master thesis project/course).
Step 2. On the OSF website, participants were provided with a 5-minute introductory video lecture on the initiative and a 20-minute video lecture on the topic of reproducibility.
Step 3. On the same website, participants were also given three self-study, self-paced assignments (in PDF format) introducing the methods and tools of reproducible research. To support them, we also set up a GitHub repository as a discussion forum where all students could discuss questions regarding the assignments and the reproducibility self-assessment of their theses.
Step 4. Participants were asked to self-assess the level of reproducibility of their master’s theses, as explained in the first video lecture. In practice, this meant adding a simple statement (as the last line of the abstract) indicating the level (score) for each of the five criteria (Nüst et al. 2018).
Step 5. At the end of the semester (or master thesis project/course), participants answered a second questionnaire (post-test assessment).
The list of questions for each questionnaire was discussed and agreed upon in a shared document.
Questionnaire #1 (pre-test) is divided into three sections:
Questionnaire #2 (post-test) is divided into two sections:
Section 1 is essentially the same in both questionnaires, as the same list of terms is asked about at the start and at the end of the semester.
Training materials developed for the experiment are publicly available in the OSF project “Reproducibility of GIScience MSc Theses - Self-assessment experiment”:
The master’s theses of the students of the Erasmus Mundus MSc in Geospatial Technologies are openly available in an institutional repository. Most of the participating students added the self-assessment sentence in the specified format.
The master’s theses of the students of TU Delft’s MSc Geomatics are openly available in an institutional repository. Here, none of the students added the self-assessment sentence as requested. However, some of them added an appendix to their theses reflecting on the reproducibility of the thesis in a narrative way.
The following plots and tables are based on the files pretest_filtered.csv and postest_filtered.csv in the folder data. Only students from MSc in Geospatial Technologies (UJI, NOVA IMS, IFGI-WWU) and MSc Geomatics (TU Delft) are analysed.
Eventually, 25 students delivered the master thesis project.
48% (12/25) of students answered the first questionnaire. All but one added the self-assessment statement to the thesis manuscript.
52% (13/25) of students added a self-assessment statement to the master’s thesis. Two of them did not answer questionnaire #1, so they are not included in the analysis.
2 (out of 12) students answered that they had previously been trained in reproducible research practices.
Imagine that you are going to explain the following terms to somebody else. How well do you KNOW them?
| School of thought | Term | low (%) | neutral (%) | high (%) | mean | sd |
|---|---|---|---|---|---|---|
| democratic | Open source | 8.33 | 0.00 | 91.67 | 4.58 | 0.90 |
| democratic | Open data | 16.67 | 0.00 | 83.33 | 4.25 | 1.14 |
| democratic | Open access | 16.67 | 8.33 | 75.00 | 4.00 | 1.13 |
| democratic | License | 0.00 | 33.33 | 66.67 | 4.17 | 0.94 |
| infrastructure | Data repositories | 33.33 | 0.00 | 66.67 | 3.50 | 1.57 |
| democratic | Intellectual property rights | 8.33 | 33.33 | 58.33 | 3.67 | 0.89 |
| democratic | Data/code versioning | 16.67 | 25.00 | 58.33 | 3.58 | 1.00 |
| pragmatic | Execution environments | 33.33 | 16.67 | 50.00 | 3.00 | 1.21 |
| infrastructure | Code repositories | 33.33 | 16.67 | 50.00 | 3.42 | 1.44 |
| public | Citizen science | 25.00 | 33.33 | 41.67 | 2.92 | 1.24 |
| pragmatic | Digital notebooks | 33.33 | 25.00 | 41.67 | 3.17 | 1.27 |
| pragmatic | Reproducible packages | 50.00 | 8.33 | 41.67 | 2.83 | 1.34 |
| infrastructure | Collaborative coding repositories | 41.67 | 25.00 | 33.33 | 2.92 | 1.38 |
| public | Science dissemination | 33.33 | 33.33 | 33.33 | 2.75 | 1.22 |
| pragmatic | Analytical workflows | 58.33 | 16.67 | 25.00 | 2.42 | 1.16 |
| infrastructure | Containers platforms | 41.67 | 33.33 | 25.00 | 2.67 | 1.30 |
| public | Science blogging | 25.00 | 50.00 | 25.00 | 2.92 | 1.16 |
| pragmatic | Computational essays | 41.67 | 41.67 | 16.67 | 2.58 | 1.24 |
The table above contains the raw data used to create the likert-scale plot in the next figure. Students had prior knowledge (yellowish answers >= 50%) of the top 9 terms: Open source, Open data, Open access, License, Data repositories, Intellectual property rights, Data/code versioning, Execution environments and Code repositories. The next term, Citizen science, is known to some extent, as one third of students opted for the neutral response. The remaining terms (Digital notebooks, Reproducible packages, Collaborative coding repositories, Science dissemination, Analytical workflows, Containers platforms, Science blogging, Computational essays) are in general poorly known and/or understood. This suggests that the terms most closely connected to reproducibility (mostly from the pragmatic school of thought, see (Fecher and Friesike 2014)) were in general unfamiliar to students (reddish answers predominate).
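Likert-scale plots of this kind can be produced with the likert package along the following lines (a sketch with made-up responses; the real input is the pre-test data frame with one factor column per term):

```r
library(likert)

# Made-up responses on a 1-5 scale for two of the terms;
# every column must be a factor sharing the same levels.
set.seed(42)
lv <- as.character(1:5)
items <- data.frame(
  `Open source`       = factor(sample(lv, 12, replace = TRUE), levels = lv),
  `Digital notebooks` = factor(sample(lv, 12, replace = TRUE), levels = lv),
  check.names = FALSE
)

lik <- likert(items)
plot(lik)  # centered stacked bar chart, one row per term
```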
Questions related to previous experience in reproducibility, main difficulties and perceived importance
| Have you ever experienced DIFFICULTIES IN REUSING somebody else’s code / data? | Have you ever had to REUSE YOUR OWN past code/data? | If you answered YES in either of the last two questions, please explain which were the MAIN DIFFICULTIES you experienced. If NO, skip question | Do you know where to look for HELP and extra information to make your research reproducible? | Please rate the perceived IMPORTANCE of doing your research in a reproducible way | According to you, why is it IMPORTANT (or not) to do your research in a reproducible way? |
|---|---|---|---|---|---|
| YES | YES | Recalling the work flow of code was difficult especially when i used multiple classes and libraries. Code comments do help but only for limited code. Proper documentation was missing for my codes. | no | 3 | Reproducibility is good in most cases but there are some research projects that should be not so open due to confidentiality and also to protect interests of researchers. If the research result is a kind of product then openness and reproducibility will give no benifit to the researcher. |
| YES | YES | To get the idea of what is done in each step of the workflow. | A bit. | 4 | If the work is reproducible, the value is significantly higher as it can be used without much more work by others. This enables improvements of the idea and process. |
| YES | YES | Problem of using static code in dynamic code. | No. | 5 | To pursue with research works and participate in conferences. |
| NO | NO |  | No, i dont know | 5 | Ability to obtain similar results on research objective or questions independent on the study or even experiment |
| YES | YES | Most time consuming part is understanding others code if they are not properly commented. Even in my code sometimes my own comments are tough to recall later. | No | 5 | Re-usability is one of the key step to move forward research words. Ease to understand other words in a scientific manner will help definitely. |
| YES | YES | To underestand the flow an the logic of the code | No | 4 | A research that can not be reproducible, did not have value |
| YES | YES | Code was structured to a certain tool and tool was not open. Data was not available in later time. | Only internet searches. | 5 | Reproducibility is essential because the scientific results are affected by data structure, data content, processing method and presentation methods. There are more variables and reproducibility enhances proper understanding and control of research. |
| YES | NO |  | Not really | 5 | It would definitely be helpful, If I or someone else would like to replicate or continue this work in different area |
| YES | YES | Documentation | No | 4 | It is important so as to verify the scientific workflow but not always possible due to restrictions in data. |
| YES | YES | Lack of README | Not really | 5 | not reproducible, means useless, and we don’t know what’s wrong if we run it again |
| YES | YES | not well documented and function parameter some time confused me even if I use my past codes | NO | 4 | it can easy and fast the productivity |
| YES | YES | The data itself and the pre-processing of the data before the analysis. | Haven’t really tried, but I try to keep all well documented. | 5 | Because so many people is doing the same thing over and over, and for a person that is learning is better to just help to understand what you did. |
The plots below refer to the questions above with numeric/logical answers.
A large majority of students encountered difficulties when trying to reuse their own code or someone else’s code (A and B). The third column in the table above shows the problems students faced. In general, students considered reproducible research practices important (C). As students had had the chance to reuse others’ work, reproducibility practices seem to be key for them; however, they encounter barriers when trying to adopt these practices.
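A panel like (A) can be reproduced from the table above with a short ggplot2 sketch (counts taken from the 12 pre-test responses: 11 YES, 1 NO to the question on difficulties reusing somebody else’s code/data):

```r
library(ggplot2)

# Counts from the pre-test table above: difficulties in reusing
# somebody else's code/data (11 YES, 1 NO).
answers <- data.frame(answer = c("YES", "NO"), n = c(11, 1))

ggplot(answers, aes(x = answer, y = n)) +
  geom_col() +
  labs(x = NULL, y = "Number of students",
       title = "Difficulties in reusing somebody else's code/data (pre-test)")
```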
| Practice group | Survey question |
|---|---|
| analysis | What tools do you plan to use to ANALYSE data? |
| visualisation | What tools do you plan to use to VISUALISE/PLOT data? |
| writing | What tools do you plan to WRITE UP your master thesis or conference/journal article? |
| workflow | What is your (expected) process of GETTING the summary data, statistical results, figures, maps and tables IN your master thesis document (or conference/journal article)? |
| What BACHELOR DEGREE (or equivalent) did you have when applying for the master programme? | How many years of PROFESSIONAL EXPERIENCE did you have when applying for the master programme? | In which context are you DEVELOPING your master thesis? | Please select which sentence better describes the PLAN you have after the completion of your Master thesis |
|---|---|---|---|
| Computational Physics | Over 2 years | At the university | Not sure yet |
| Bachelor of engineering | None | At the university | Not sure yet |
| Information Technology(Engineering) | Up to 2 years | At the university | Continue with doctoral studies (or another master degree) |
| BCS in Computer science | Over 2 years | At the university | Continuing in my previous job (teaching GIS in university) |
| Bachelor in Computer Science and Engineering. | Over 2 years | At the university | Find a job in academia (researcher, technician, etc) or in education-related institutions (teacher, etc) |
| Geography | None | As internship in industry | Find a job in academia (researcher, technician, etc) or in education-related institutions (teacher, etc) |
| Bachelor in Geomatics Engineering | Over 2 years | At the university | Find a job in government agencies or institutions |
| Bachelor in Geomatics Engineering | Over 2 years | At the university | Have to continue previous job (Government ) for some period |
| Urban and Regional Planning | None | At the university | Continue with doctoral studies (or another master degree) |
| Computer Science | Over 2 years | At the university | Find a job in industry (or set up own company) |
| Civil engineering | Over 2 years | At the university | Continue with doctoral studies (or another master degree) |
| Natural Renewable Resource Engeneering | Up to 2 years | At the university | Find a job in industry (or set up own company) |
28% (7/25) of students answered the second questionnaire. All added the self-assessment statement to the thesis manuscript.
Imagine that you are going to explain the following terms to somebody else. How well do you KNOW them?
| School of thought | Term | low (%) | neutral (%) | high (%) | mean | sd |
|---|---|---|---|---|---|---|
| democratic | Open access | 0.00 | 0.00 | 100.00 | 4.86 | 0.38 |
| democratic | Open data | 0.00 | 14.29 | 85.71 | 4.57 | 0.79 |
| infrastructure | Code repositories | 0.00 | 14.29 | 85.71 | 4.57 | 0.79 |
| democratic | Open source | 0.00 | 14.29 | 85.71 | 4.71 | 0.76 |
| pragmatic | Reproducible packages | 0.00 | 14.29 | 85.71 | 4.29 | 0.76 |
| infrastructure | Data repositories | 0.00 | 14.29 | 85.71 | 4.43 | 0.79 |
| pragmatic | Analytical workflows | 28.57 | 0.00 | 71.43 | 3.71 | 1.25 |
| infrastructure | Collaborative coding repositories | 28.57 | 0.00 | 71.43 | 3.71 | 1.25 |
| democratic | Intellectual property rights | 0.00 | 42.86 | 57.14 | 3.71 | 0.76 |
| pragmatic | Digital notebooks | 42.86 | 0.00 | 57.14 | 3.43 | 1.40 |
| pragmatic | Execution environments | 14.29 | 28.57 | 57.14 | 3.86 | 1.21 |
| public | Science blogging | 28.57 | 14.29 | 57.14 | 3.43 | 1.13 |
| public | Science dissemination | 28.57 | 14.29 | 57.14 | 3.14 | 1.57 |
| democratic | Data/code versioning | 0.00 | 57.14 | 42.86 | 3.71 | 0.95 |
| public | Citizen science | 42.86 | 14.29 | 42.86 | 3.00 | 1.41 |
| democratic | License | 14.29 | 57.14 | 28.57 | 3.43 | 1.13 |
| pragmatic | Computational essays | 42.86 | 28.57 | 28.57 | 3.00 | 1.53 |
| infrastructure | Containers platforms | 42.86 | 28.57 | 28.57 | 2.71 | 1.50 |
The table above contains the raw data used to create the likert-scale plot in the next figure.
Have you read/watched the available materials (slides, videos, additional papers, etc)?
| Are you going to use/adapt reproducible research practice(s) in your current and/or future research projects (master thesis, doctoral thesis, etc.)? | Are you planning to learn more about reproducible research on your own? | How important are reproducibility practices for your future professional career (academia, industry, government, etc.)? |
|---|---|---|
| Not now, but I will definitively use/adapt them in the future | 4 | 4 |
| Yes, I am going to use/adapt them from now | 4 | 5 |
| Yes, I am going to use/adapt them from now | 5 | 5 |
| Not now, but I will definitively use/adapt them in the future | 5 | 4 |
| Yes, I am going to use/adapt them from now | 4 | 4 |
| Yes, I am going to use/adapt them from now | 5 | 5 |
| Not now, but maybe I will explore them in the future | 2 | 4 |
The plots below refer to the questions above.
Imagine that you are going to explain the following terms to somebody else. How well do you KNOW them?
Future work: statistically determine the change between pre-test and post-test Likert responses using the Wilcoxon signed-rank test.
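Such a comparison could look as follows (a sketch with made-up paired scores; the real analysis would pair each student’s pre- and post-test rating for a given term):

```r
# Hypothetical paired ratings (1-5 scale) for one term, from the
# same 7 students before and after the intervention.
pre  <- c(3, 2, 4, 3, 2, 3, 4)
post <- c(4, 3, 4, 4, 3, 5, 4)

# Paired Wilcoxon signed-rank test; zero differences are dropped and
# ties trigger a normal approximation (with a warning).
res <- wilcox.test(pre, post, paired = TRUE)
res$p.value
```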
Example of self-assessment statement included in a thesis abstract
The distribution of reproducibility levels by criterion contrasts notably with the results obtained in the evaluation of the AGILE/GIScience papers (Nüst et al. 2018). Surprisingly, all criteria reach level 3, the highest (ideal) reproducibility level. The NA count (1 per criterion) corresponds to the student who answered the questionnaire but did not do the self-assessment, so NAs are not relevant for the analysis; none of the students who did the self-assessment assigned NA values to the criteria. Indeed, the lowest score assigned to any criterion was 1, which sets the bar quite high. Without a reproducibility check by a third person, these self-assessed values seem quite inflated, well above the levels we found in academic publications. This can be interpreted in at least two ways: either the master’s theses were of outstanding quality with regard to reproducibility, or, the more credible interpretation, students did not fully understand the reproducibility levels and criteria and failed to make an objective self-evaluation.
Eventually, 30 students delivered the master thesis project.
47% (14/30) of students answered the first questionnaire.
1 (out of 14) student answered that they had previously been trained in reproducible research practices.
Imagine that you are going to explain the following terms to somebody else. How well do you KNOW them?
| School of thought | Term | low (%) | neutral (%) | high (%) | mean | sd |
|---|---|---|---|---|---|---|
| infrastructure | Code repositories | 0.00 | 7.14 | 92.86 | 4.64 | 0.63 |
| democratic | Open access | 0.00 | 14.29 | 85.71 | 4.50 | 0.76 |
| democratic | Open source | 0.00 | 14.29 | 85.71 | 4.50 | 0.76 |
| infrastructure | Collaborative coding repositories | 14.29 | 0.00 | 85.71 | 4.07 | 1.21 |
| infrastructure | Data repositories | 0.00 | 14.29 | 85.71 | 4.29 | 0.73 |
| democratic | Open data | 0.00 | 21.43 | 78.57 | 4.36 | 0.84 |
| democratic | Data/code versioning | 0.00 | 21.43 | 78.57 | 4.36 | 0.84 |
| democratic | License | 0.00 | 28.57 | 71.43 | 4.00 | 0.78 |
| democratic | Intellectual property rights | 21.43 | 28.57 | 50.00 | 3.50 | 1.45 |
| pragmatic | Analytical workflows | 35.71 | 28.57 | 35.71 | 3.07 | 1.38 |
| pragmatic | Digital notebooks | 50.00 | 28.57 | 21.43 | 2.64 | 1.50 |
| pragmatic | Execution environments | 42.86 | 35.71 | 21.43 | 2.64 | 1.39 |
| public | Science blogging | 57.14 | 21.43 | 21.43 | 2.36 | 1.34 |
| infrastructure | Containers platforms | 57.14 | 28.57 | 14.29 | 2.14 | 1.17 |
| public | Citizen science | 57.14 | 28.57 | 14.29 | 2.14 | 1.17 |
| public | Science dissemination | 71.43 | 14.29 | 14.29 | 2.14 | 1.23 |
| pragmatic | Reproducible packages | 57.14 | 35.71 | 7.14 | 2.14 | 1.23 |
| pragmatic | Computational essays | 71.43 | 28.57 | 0.00 | 1.64 | 0.93 |
The table above contains the raw data used to create the likert-scale plot in the next figure.
Questions related to previous experience in reproducibility, main difficulties and perceived importance
| Have you ever experienced DIFFICULTIES IN REUSING somebody else’s code / data? | Have you ever had to REUSE YOUR OWN past code/data? | If you answered YES in either of the last two questions, please explain which were the MAIN DIFFICULTIES you experienced. If NO, skip question | Do you know where to look for HELP and extra information to make your research reproducible? | Please rate the perceived IMPORTANCE of doing your research in a reproducible way | According to you, why is it IMPORTANT (or not) to do your research in a reproducible way? |
|---|---|---|---|---|---|
| YES | YES | Lack of commenting and other variable names than I would choose made re-use difficult for me. Sometimes you do not know what kind of thing a function returns, is it a list or a dictionary, this influences how you use the output. | Yes, now that we have this course I can look it up here. | 3 | I think it is important to some degree that when it matters it can be verified if the results of a research are correct, however, I dont think you should over do it or at least keep this separate from the actual work because it might cause information overload. I can also imagine that if you do research with users and actual people, reproducibility might be more inconsistent and can even lead to privacy issues. |
| YES | YES | the environment of my system is often different from the original code | NO | 4 | That it can be used by other researchers as well |
| YES | YES | Dependency problems, lack of comments in the code etc | No | 5 | So the results can be validated by others |
| YES | YES | Hard to configure the execution environments with a C++ project in GitHub. | No. | 5 | Avoid re-invent of the wheels and others can continue with your work. |
| YES | YES | difficult to find out what the required input formats are for code to work and difficult to see which hardcoded elements I need to change | NO | 4 | It is important so that if I made mistakes in my research people can see this instead of that they just believe that all my conclusions are true. |
| YES | YES | The main problem often is to make the data/code suitable for your problem. With geographic data I experienced problems with the coordinate reference systems used (sometimes not well documented, reprojection difficult). With own code I’ve not really experienced big problems. | I have never looked this up before. I would look things up on Google. | 4 | It is extremely useful if other people can use your research and if it’s well documented. If other people can understand and use your research, they might even find ‘problems’ or improvements, which will benefit the research even more. |
| YES | YES | No sufficient documentation on what the code actually does/means | Documentation/Github | 5 | Validity of the research can be tested |
| YES | YES | understanding and finding for which case, datasets | no | 5 | We speak of knowledge when methods are always leading to the same results under the same circumstances. This could only be seen if the research is reproducible. |
| YES | YES | understanding and finding for which case, datasets | no | 5 | We speak of knowledge when methods are always leading to the same results under the same circumstances. This could only be seen if the research is reproducible. |
| YES | YES | To reuse someone’s code I need to understand it and then check the possibility to use it (which license used to publish it). | No | 4 | Important: to have defined clearly your topic, then to describe the methodology and the implemented algorithms and present and clarify your results |
| YES | YES | To reuse someone’s code I need to understand it and then check the possibility to use it (which license used to publish it). | No | 4 | Important: to have defined clearly your topic, then to describe the methodology and the implemented algorithms and present and clarify your results |
| NO | YES | Too messy/unstructured. Comments were generally fine. | No. | 5 | It can proof that your research has been conducted in a proper way. Also, others can note mistakes. Or it can be useful for further research. |
| YES | YES |  | Normally I look on google whenever i need any information about better structuring my code and mainly follow github style for major codes. But not explicitly | 4 | I would say it is important to do research in reproducible way to get feedback from community and improve the current status by additional feedback. But also it happens that it is difficult to do somethings in reproducible manner due to shortage of time. But I think the way Geomatics taught us coding and research is good way to reproduce because I was able to use my previous code lot of times without problems. |
| YES | YES | The biggest issue of reusing code of others is sometimes because of the version difference of used software and libraries, it’s quite hard to configure the environment correctly. | yes | 5 | Because doing research in a reproducible way can not only help others who work on similar topics, but also help the future work of our own. |
The plots below refer to the questions above with numeric/logical answers.
| Practice group | Survey question |
|---|---|
| analysis | What tools do you plan to use to ANALYSE data? |
| visualisation | What tools do you plan to use to VISUALISE/PLOT data? |
| writing | What tools do you plan to WRITE UP your master thesis or conference/journal article? |
| workflow | What is your (expected) process of GETTING the summary data, statistical results, figures, maps and tables IN your master thesis document (or conference/journal article)? |
| What BACHELOR DEGREE (or equivalent) did you have when applying for the master programme? | How many years of PROFESSIONAL EXPERIENCE did you have when applying for the master programme? | In which context are you DEVELOPING your master thesis? | Please select which sentence better describes the PLAN you have after the completion of your Master thesis |
|---|---|---|---|
| Maritime Engineering | None | At the university | Find a job in industry (or set up own company) |
| two bachelor degrees: (IT-Engineering & Business Consulting) & Geography | Up to 1 year | As internship in industry | Continue with doctoral studies (or another master degree) |
| M. Eng. Rural and Surveying Engineering | Up to 1 year | As internship in industry | Not sure yet |
| Geomatics | None | At the university | Continue with doctoral studies (or another master degree) |
| Future Planet Studies | Up to 6 months | As internship in industry | Find a job in industry (or set up own company) |
| Computer Science | Up to 1 year | At the university | Not sure yet |
| Architecture for the Built Enviornment | None | As internship in industry | Not sure yet |
| earth and economics | None | As internship in industry | Find a job in government agencies or institutions |
| earth and economics | None | As internship in industry | Find a job in government agencies or institutions |
| Integrated Master of Rural and Surveying Engineering | Up to 1 year | As internship in industry | Either in institutions or in industry |
| Integrated Master of Rural and Surveying Engineering | Up to 1 year | As internship in industry | Either in institutions or in industry |
| Human Geography | None | At the university | Rest (take a gap year or similar) |
| Agriculture and Food Engineering | Over 2 years | At the university | Continue with doctoral studies (or another master degree) |
| Remote Sensing | Over 2 years | At the university | Find a job in industry (or set up own company) |
10% (3/30) of students answered the second questionnaire.
Imagine that you are going to explain the following terms to somebody else. How well do you KNOW them?
| School of thought | Term | low (%) | neutral (%) | high (%) | mean | sd |
|---|---|---|---|---|---|---|
| pragmatic | Execution environments | 0.00 | 0.00 | 100.00 | 4.00 | 0.00 |
| public | Science blogging | 0.00 | 0.00 | 100.00 | 4.00 | 0.00 |
| democratic | Open access | 0.00 | 0.00 | 100.00 | 4.33 | 0.58 |
| democratic | Open data | 0.00 | 0.00 | 100.00 | 4.67 | 0.58 |
| democratic | Open source | 0.00 | 0.00 | 100.00 | 4.67 | 0.58 |
| democratic | Data/code versioning | 0.00 | 0.00 | 100.00 | 4.67 | 0.58 |
| democratic | License | 0.00 | 0.00 | 100.00 | 4.33 | 0.58 |
| infrastructure | Collaborative coding repositories | 0.00 | 0.00 | 100.00 | 4.33 | 0.58 |
| infrastructure | Code repositories | 0.00 | 0.00 | 100.00 | 4.67 | 0.58 |
| infrastructure | Data repositories | 0.00 | 0.00 | 100.00 | 4.67 | 0.58 |
| democratic | Intellectual property rights | 0.00 | 33.33 | 66.67 | 4.00 | 1.00 |
| pragmatic | Computational essays | 66.67 | 0.00 | 33.33 | 2.33 | 1.53 |
| public | Citizen science | 33.33 | 33.33 | 33.33 | 3.00 | 1.00 |
| pragmatic | Digital notebooks | 33.33 | 66.67 | 0.00 | 2.67 | 0.58 |
| pragmatic | Analytical workflows | 33.33 | 66.67 | 0.00 | 2.67 | 0.58 |
| pragmatic | Reproducible packages | 33.33 | 66.67 | 0.00 | 2.67 | 0.58 |
| infrastructure | Containers platforms | 33.33 | 66.67 | 0.00 | 2.33 | 1.15 |
| public | Science dissemination | 0.00 | 100.00 | 0.00 | 3.00 | 0.00 |
The table above contains the raw data used to create the likert-scale plot in the next figure.
Have you read/watched the available materials (slides, videos, additional papers, etc.)?
| Are you going to use/adapt reproducible research practice(s) in your current and/or future research projects (master thesis, doctoral thesis, etc.)? | Are you planning to learn more about reproducible research on your own? | How important are reproducibility practices for your future professional career (academia, industry, government, etc.)? |
|---|---|---|
| Yes, I am going to use/adapt them from now | 2 | 3 |
| Yes, I am going to use/adapt them from now | 4 | 5 |
| Not now, but maybe I will explore them in the future | 3 | 4 |
The plots below refer to the questions above.
Imagine that you are going to explain the following terms to somebody else. How well do you KNOW them?
It does not make sense to statistically test the change between pre-test and post-test Likert questions because of the large difference in the number of responses per questionnaire: 14 and 3, respectively.
As noted previously, TU Delft students did not provide the self-assessment statement. Yet, some of them added a one-page reflection to their thesis manuscripts. TODO: to be analysed.